Operating systems like Macintosh, Windows and Amiga often support a character set of only 256 characters, each allocated a system number from 0-255 (although some numbers are reserved for special printing/display-terminal functions like carriage return and line feed). The operating system associates a default, predefined character (or 'glyph') with each displayable system number, depending on the language being supported. (On operating systems intended to display this page, the glyphs for the numerals 0-9 and the letters A-Z and a-z are allocated in the system number range 48-122, along with some punctuation marks.) The appearance, or font style, of the individual glyphs can be changed by selecting an alternative font (e.g. 'Times'). However, some fonts do not have glyphs defined for all of the displayable operating system character numbers.
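As a rough illustration of how one system number can stand for different glyphs, the following Python sketch uses Python's built-in codecs as stand-ins for operating system character sets. This is an assumption made for illustration only: PageStream's own tables are not consulted, and `glyph_for` is a hypothetical helper name.

```python
# Illustrative sketch: Python's codecs stand in for OS character sets.
# glyph_for is a hypothetical helper, not a PageStream function.

def glyph_for(system_number, codec):
    """Return the character that a one-byte character set assigns to a system number."""
    return bytes([system_number]).decode(codec)

# The numerals and unaccented letters sit at the same numbers in these sets.
print(ord("0"), ord("9"))   # 48 57
print(ord("A"), ord("Z"))   # 65 90
print(ord("a"), ord("z"))   # 97 122

# In the upper half of the range the sets disagree about the same number.
for codec in ("mac_roman", "cp1252"):
    print(codec, hex(0x8E), glyph_for(0x8E, codec))
```

System number 0x8E is an accented e in the Macintosh set but a Z with caron in Windows CP1252, which is why a document has to record which character is meant rather than which system number was typed.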
PageStream supports many more than 256 characters (glyphs) because it represents characters in document files by using the Unicode character set (an international standard with a potential of hundreds of thousands, even millions, of characters).
In order for PageStream to display the document characters on the screen, send them to a printer or create a PDF file, it has to translate from the Unicode number in the file to the appropriate system number for the operating system and language being used. This applies both to printable characters and to non-printable ones, such as the names of variables, style tags, and element names. If a dialogue box is used to change, for example, a style tag, PageStream has to translate the characters selected in the dialogue box (which conform to the operating system character set) into Unicode before inserting them as non-printable characters into the document file. The same is true of printable text typed into PageStream text frames and boxes.
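The two directions of translation can be sketched in Python, again assuming that Python's built-in codecs approximate the operating system character sets. The function names `to_system_number` and `to_unicode` are hypothetical and say nothing about how PageStream implements the conversion internally.

```python
# Illustrative sketch of the two translation directions.
# Function names are hypothetical; Python codecs stand in for OS character sets.

def to_system_number(char, codec):
    """Unicode character -> system number, or None if the set has no slot for it."""
    try:
        encoded = char.encode(codec)
    except UnicodeEncodeError:
        return None
    return encoded[0] if len(encoded) == 1 else None

def to_unicode(system_number, codec):
    """System number -> Unicode code point, as when text comes from a dialogue box."""
    return ord(bytes([system_number]).decode(codec))

e_acute = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
z_caron = "\u017d"   # LATIN CAPITAL LETTER Z WITH CARON

print(to_system_number(e_acute, "mac_roman"))   # 142 (0x8E)
print(to_system_number(e_acute, "cp1252"))      # 233 (0xE9)
print(to_system_number(z_caron, "mac_roman"))   # None: not in this character set
print(hex(to_unicode(0x8E, "mac_roman")))       # 0xe9
```

The `None` result marks the situation in which a Unicode character in the document has no equivalent in the selected operating system character set.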
PageStream allows you to insert any Unicode character into a document (see Inserting Special Characters). However, if that character is not included in the operating system character set, then PageStream will do one of two things,
On installation, PageStream uses the default character set for the operating system and language in use. To change the operating system character set, select the required character set in the File/Preferences panel under the Type tab.
PageStream's supported character sets consist of built-in support for Amiga, Macintosh, MS DOS, and Windows characters, with additional character sets stored in ASCII format in individual files in the SoftLogik/CSets folder.
Name in PageStream | file name | Comments |
Amiga | <built in> | Amiga |
Macintosh | <built in> | Macintosh |
Windows | <built in> | Windows |
MS-Dos | <built in> | MS DOS |
AmigaPL | AmigaPL.txt | Amiga Polish |
IBM CP1006 | CP1006.txt | |
Windows CP1250(EE) | WINCP1250.txt | |
Windows CP1251 | WINCP1251.txt | |
Windows CP1252 | WINCP1252.txt | |
Windows CP1253 | WINCP1253.txt | |
Windows CP1254 | WINCP1254.txt | |
Windows CP1255 | WINCP1255.txt | |
Windows CP1256 | WINCP1256.txt | |
Windows CP1257 | WINCP1257.txt | |
Windows CP1258 | WINCP1258.txt | |
IBM EBCDIC CP424 (Hebrew) | CP424.txt | |
HebrewPC CP856 | CP856.txt | |
Windows CP874 | WINCP874.txt | |
ISO 8859-01 | ISO 8859-1.txt | Western Europe |
ISO 8859-02 | ISO 8859-2.txt | Eastern Europe |
ISO 8859-03 | ISO 8859-3.txt | Southeastern Europe |
ISO 8859-04 | ISO 8859-4.txt | Northern Europe |
ISO 8859-05 | ISO 8859-5.txt | Cyrillic |
ISO 8859-06 | ISO 8859-6.txt | Arabic |
ISO 8859-07 | ISO 8859-7.txt | Greek |
ISO 8859-08 | ISO 8859-8.txt | Hebrew |
ISO 8859-09 | ISO 8859-9.txt | Western Europe & Turkish |
ISO 8859-10 | ISO 8859-10.txt | Baltic |
ISO 8859-13 | ISO 8859-13.txt | |
ISO 8859-14 | ISO 8859-14.txt | |
ISO 8859-15 | ISO 8859-15.txt | |
ISO 8859-16 | ISO 8859-16.txt | |
KOI8-R | KOI8-R.txt | |
Macintosh-CE | MacintoshCE | Central Europe |
If a character set is not supported by PageStream, please email details to support@grasshopperllc.com and support can be added for your character set. You may also create a new character set by making a copy of the closest existing character set file in the SoftLogik/CSets folder and modifying it. Save the file in the folder with an appropriate file name. If you do so, please share with other users by sending a copy to us!
In the future, character set support for JIS and other multibyte systems will be introduced. Customers interested in such support should contact us.
This is the format of the character set files used for conversion from Unicode characters to the operating system character set. The file must be in ASCII format. The line endings do not matter, nor does the case of the commands. These files are stored in the CSets folder inside the SoftLogik folder, and new ones may be created using existing files as a template. It is recommended that all of the commands be included as shown (but with the values you want, if any) in every file.
APP LIBRARY CHARACTER SET v1.5 | This should be the first line in every character set file. |
NAME "CP1250" | The name of the character set as it is used in scripts. It must be unique among the existing character sets, and it is recommended that it contain no spaces, since spaces in script variables can be tricky to deal with. Normally the file name is the same as this name with a .txt extension, although they may differ, and the extension is not important to PageStream. |
GUINAME "Windows CP1250(EE)" | The name of the character set as it appears in the CharSet popup in Preferences-Type, as well as any other location that a character set may be specified (such as the ASCII text filter's import and export dialog box). |
BASESET "Windows" | The built-in character set used as a starting point. It is not mandatory, but if it is given, then only those characters which differ from the built-in character set need be listed. As the first 128 characters are almost always the same, this saves some redundancy. Also, some character sets differ only slightly from a built-in set in the upper 128 characters. If you want to use a BASESET, choose the one closest to the character set you are trying to support. It must be either Amiga, Macintosh, Windows, or MSDos. |
NEWLINETYPE CRLF | Defines what the host platform for this character set requires for a new line. Amiga character sets will normally have a newlinetype of LF. Macintosh character sets will normally have a newlinetype of CR. Windows (and MSDos) character sets will normally have a newlinetype of CRLF. The main function of the field is for the export of PageStream's newline into a text file by the ASCII filter. |
BEGIN | The start of the character map. |
# # This line is for information only | Everything after the # symbol is treated as a comment and provides information for anyone reading the contents of the ASCII file. The comment is ignored by PageStream. |
207 323 0x8e,0x00e9 0x2E 0x002E # FULL STOP etc... | Each mapping gives the operating system character number followed, after a comma or space separator, by the required Unicode character number. The numbers may be given in decimal or, as in the later examples here, in hex (hexadecimal) format with a 0x prefix. The comment after the # explains that the last example defines the full stop punctuation mark. You may find Unicode maps on the internet expressed in hex rather than decimal, in which case it is easier to use the hex format directly than to convert to decimal. |
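To make the file format more concrete, here is a minimal Python sketch of a reader for such a file. It is not PageStream's parser: `parse_number`, `parse_charset`, `NEWLINES` and `SAMPLE` are hypothetical names, and the sketch assumes that the character map runs from BEGIN to the end of the file, that each mapping line holds exactly one pair, and that a # never appears inside a quoted name. The sample values are taken from the examples in the table above.

```python
import re

# Hypothetical sample file assembled from the examples in this section.
SAMPLE = """APP LIBRARY CHARACTER SET v1.5
NAME "CP1250"
GUINAME "Windows CP1250(EE)"
BASESET "Windows"
NEWLINETYPE CRLF
BEGIN
# This line is for information only
207 323
0x8e,0x00e9
0x2E 0x002E # FULL STOP
"""

# Newline sequences named by NEWLINETYPE (LF, CR or CRLF).
NEWLINES = {"LF": "\n", "CR": "\r", "CRLF": "\r\n"}

def parse_number(token):
    """Accept decimal (207) or hex with a 0x prefix (0x8e)."""
    token = token.strip()
    return int(token, 16) if token.lower().startswith("0x") else int(token, 10)

def parse_charset(text):
    """Return (header, mapping); mapping is system number -> Unicode number."""
    header, mapping, in_map = {}, {}, False
    for raw in text.splitlines():            # copes with CR, LF and CRLF endings
        line = raw.split("#", 1)[0].strip()  # drop comments (assumes no # in names)
        if not line:
            continue
        if not in_map:
            keyword, _, rest = line.partition(" ")
            keyword = keyword.upper()        # command case does not matter
            if keyword == "BEGIN":
                in_map = True
            elif keyword in ("NAME", "GUINAME", "BASESET", "NEWLINETYPE"):
                header[keyword] = rest.strip().strip('"')
            continue
        parts = [p for p in re.split(r"[,\s]+", line) if p]
        if len(parts) != 2:
            raise ValueError(f"expected two numbers, got: {raw!r}")
        system_number, unicode_number = (parse_number(p) for p in parts)
        mapping[system_number] = unicode_number
    return header, mapping

header, mapping = parse_charset(SAMPLE)
print(header["NAME"], header["GUINAME"])
print(repr(NEWLINES[header["NEWLINETYPE"].upper()]))
for system_number, unicode_number in sorted(mapping.items()):
    print(f"{system_number:3d} -> U+{unicode_number:04X}")
```

Only the entries listed in the file end up in `mapping`; with a BASESET given, the remaining system numbers would come from the named built-in character set, which this sketch does not model.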